AutoViz: Transforming Data into Visual Insights with a Single Line of Code in Python

Introduction: AutoViz visualizes a dataset of any size with a single line of code, and it includes features for rapid assessment of dataset quality.

AutoViz was created to make exploring data easier and faster. It automatically generates charts and graphs that surface important patterns and trends in your data. Beginners benefit because it removes most of the plotting boilerplate, and experienced analysts can use it to spot insights they might otherwise have missed.

Give it a try and see how simple and powerful automated visualization can be!

Step 1: Installation
In [ ]:
!pip install autoviz
Step 2: Import the autoviz library
Use the following code to instantiate AutoViz_Class.
The AutoViz() method can also create charts in multiple formats using the chart_format setting:
  • If chart_format='png', 'svg', or 'jpg': Matplotlib charts are plotted inline. They can be saved locally (verbose=2) or displayed (verbose=1) in Jupyter Notebooks. This is the default behavior for AutoViz.
  • If chart_format='bokeh': Interactive Bokeh charts are plotted in Jupyter Notebooks.
  • If chart_format='server': a dashboard for each kind of chart opens in your browser.
  • If chart_format='html', interactive Bokeh charts will be created and silently saved as HTML files under the AutoViz_Plots directory (under working folder) or any other directory that you specify using the save_plot_dir setting (during input).
In [11]:
from autoviz.AutoViz_Class import AutoViz_Class
%matplotlib inline
AV = AutoViz_Class()
df = AV.AutoViz('empdata.csv')
Shape of your Data Set loaded: (1470, 4)
#######################################################################################
######################## C L A S S I F Y I N G  V A R I A B L E S  ####################
#######################################################################################
Classifying variables in data set...
    4 Predictors classified...
        1 variable(s) removed since they were ID or low-information variables
        List of variables removed: ['EmployeeNumber']
To fix data quality issues automatically, import FixDQ from autoviz...
                Data Type  Missing Values%  Unique Values%  Minimum Value  Maximum Value  DQ Issue
EmployeeNumber  int64      0.000000         100             1.000000       2068.000000    Possible ID colum: drop before modeling process.
Attrition       object     0.000000         0               nan            nan            No issue
Department      object     0.000000         0               nan            nan            No issue
MonthlyIncome   int64      0.000000         91              1009.000000    19999.000000   has 114 outliers greater than upper bound (16581.00) or lower than lower bound(-5291.00). Cap them or remove them.
All Plots done
Time to run AutoViz = 3 seconds 

 ###################### AUTO VISUALIZATION Completed ########################
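
The outlier bounds reported for MonthlyIncome in the output above (16581.00 and -5291.00) are consistent with the common 1.5 × IQR rule (Tukey fences). That AutoViz uses exactly this rule is an assumption here, and the quartiles in the sketch (2911 and 8379) are back-solved from the printed bounds rather than read from the dataset. Under those assumptions, the calculation would look roughly like this:

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier bounds using the k * IQR rule."""
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


# Quartiles back-solved from the bounds in the report above
# (an assumption of this sketch, not something AutoViz printed).
lower, upper = tukey_fences(2911, 8379)
print(lower, upper)  # -5291.0 16581.0
```

Because the minimum MonthlyIncome in the report is 1009, no value can fall below the negative lower bound, so under this reading all 114 flagged values lie above 16581.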

Arguments for the AV.AutoViz() method:

  • filename: Use an empty string ("") if there is no file and you want to plot from a dataframe; in that case, pass the dataframe through the dfte argument. Otherwise, provide a filename and leave the dfte argument as an empty string. Only one of the two can be used.
  • sep: File separator (comma, semi-colon, tab, or any column-separating value) if you use a filename above.
  • depVar: Target variable in your dataset; set it as an empty string if not applicable.
  • dfte: The pandas dataframe to plot; leave it as an empty string if you are using a filename.
  • header: The row number of the header row in your file. The default, 0, means the first row.
  • verbose: 0 for minimal info and charts, 1 for more info and charts, or 2 for saving charts locally without display.
  • chart_format: 'svg', 'png', 'jpg', 'bokeh', 'server', or 'html' for displaying or saving charts in various formats, depending on the verbose option.
  • max_rows_analyzed: Limit the max number of rows to use for visualization when dealing with very large datasets (millions of rows). A statistically valid sample will be used by autoviz. Default is 150000 rows.
  • max_cols_analyzed: Limit the number of continuous variables to be analyzed. Default is 30 columns.
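
The rules in the list above (exactly one of filename or dfte, a fixed set of chart_format values, verbose in 0 to 2) can be checked before calling AutoViz. The helper below is hypothetical and not part of autoviz; it only collects keyword arguments using the names from the list. The defaults for sep, verbose, and chart_format are placeholder choices for this sketch, so check them against your installed version.

```python
ALLOWED_CHART_FORMATS = {"svg", "png", "jpg", "bokeh", "server", "html"}


def build_autoviz_kwargs(filename="", dfte="", sep=",", depVar="", header=0,
                         verbose=1, chart_format="svg",
                         max_rows_analyzed=150000, max_cols_analyzed=30):
    """Hypothetical helper: validate and collect arguments for AV.AutoViz()."""
    has_file = filename != ""
    # dfte may be a dataframe, so only compare it to "" when it is a string
    has_frame = not (isinstance(dfte, str) and dfte == "")
    if has_file == has_frame:
        raise ValueError("Pass exactly one of filename or dfte")
    if chart_format not in ALLOWED_CHART_FORMATS:
        raise ValueError(f"Unsupported chart_format: {chart_format!r}")
    if verbose not in (0, 1, 2):
        raise ValueError("verbose must be 0, 1, or 2")
    return {
        "filename": filename, "sep": sep, "depVar": depVar, "dfte": dfte,
        "header": header, "verbose": verbose, "chart_format": chart_format,
        "max_rows_analyzed": max_rows_analyzed,
        "max_cols_analyzed": max_cols_analyzed,
    }


# Example: read from a file, treat Attrition as the target, save PNG charts.
kwargs = build_autoviz_kwargs(filename="empdata.csv", depVar="Attrition",
                              chart_format="png", verbose=2)
# With autoviz installed and AV = AutoViz_Class(): df = AV.AutoViz(**kwargs)
```

Validating up front gives a clear error message instead of a failure partway through chart generation.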

Let's try other options.

1. Set chart_format='png', 'svg', or 'jpg': Matplotlib charts are plotted inline and can be saved locally (using the verbose=2 setting).
In [ ]:
# Save the Matplotlib charts as PNG files without displaying them
df = AV.AutoViz('empdata.csv', chart_format='png', verbose=2)
In [8]:
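from PIL import Image
# Open one of the charts that AutoViz saved to disk
img = Image.open('Heat_Maps.png')
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
img.thumbnail((500, 500), Image.LANCZOS)
img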
Out[8]:
Likewise, we get PNG images for heat maps, bar plots, scatter plots, violin plots, etc.
2. Set chart_format='html': interactive Bokeh charts are created and silently saved as HTML files under the AutoViz_Plots directory (in the working folder) or in any other directory that you specify using the save_plot_dir setting.
In [13]:
df = AV.AutoViz('empdata.csv', chart_format='html', verbose=2)

Likewise, we get HTML files for heat maps, scatter plots, violin plots, etc.
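
To confirm which files were written, you can list the contents of the output directory. This is a minimal sketch using only the standard library; list_saved_charts is a hypothetical helper, and because the exact subfolder layout under AutoViz_Plots is an assumption, it searches recursively.

```python
from pathlib import Path


def list_saved_charts(plot_dir="AutoViz_Plots", suffix=".html"):
    """Return chart files under plot_dir (searched recursively), sorted by path."""
    root = Path(plot_dir)
    if not root.is_dir():
        return []  # nothing has been saved yet, or a different directory was used
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())


# Print every saved HTML chart; pass suffix=".png" for the Matplotlib output.
for chart in list_saved_charts():
    print(chart)
```

If you passed save_plot_dir to AutoViz, give the same path as plot_dir here.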

Conclusion:

In summary, AutoViz simplifies data analysis through automated visualization. It is approachable for beginners and useful for experienced analysts, offering quick insights and data quality assessments with very little code.